Associations of cardiovascular risk factors and lifestyle behaviors with neurodegenerative disease: a Mendelian randomization study

Previous observational studies reported that midlife clustering of cardiovascular risk factors and lifestyle behaviors was associated with neurodegenerative disease; however, these findings might be biased by confounding and reverse causality. This study aimed to investigate the causal associations of cardiovascular risk factors and lifestyle behaviors with neurodegenerative disease using a two-sample Mendelian randomization (MR) design. Genetic variants for the modifiable risk factors and neurodegenerative disease were extracted from large-scale genome-wide association studies. The inverse-variance weighted method was used as the main analysis method, and MR-Egger regression and leave-one-out analyses were performed to identify potential violations of the MR assumptions. Genetically predicted diastolic blood pressure (DBP: OR per 1 mmHg, 0.990 [0.979–1.000]), body mass index (BMI: OR per 1 SD, 0.880 [0.825–0.939]), and educational level (OR per 1 SD, 0.698 [0.602–0.810]) were associated with lower risk of late-onset Alzheimer’s disease (LOAD), while genetically predicted low-density lipoprotein (LDL: OR per 1 SD, 1.302 [1.066–1.590]) might increase LOAD risk. Genetically predicted exposures (including LDL and BMI) showed the same effects on familial AD. LDL was also associated with amyotrophic lateral sclerosis (ALS) (OR per 1 SD, 1.180 [1.080–1.289]). This MR analysis showed that LDL, BMI, blood pressure (BP), and educational level were causally related to AD; a significant association between LDL and ALS risk, as well as a potential effect of sleep duration on Parkinson’s disease (PD) risk, was also revealed. Targeting these modifiable factors may be a promising strategy for preventing neurodegenerative disease.


INTRODUCTION
There is an increasing prevalence of neurodegenerative disease in the global aging population [1], which places a major economic burden on healthcare services. In addition to age, certain environmental or lifestyle factors combined with genetic factors increase the risk of neurodegenerative disease [2]. Since neurodegenerative disease is incurable, a preventive strategy targeting its risk factors is warranted.
Previous epidemiological studies and meta-analyses identified potentially modifiable risk factors that could be targeted in preventive measures to reduce the incidence of Alzheimer's disease (AD), the most common neurodegenerative disease [3]. These factors include cardiovascular risk factors, such as metabolic diseases [4] and blood pressure [5]; lifestyle behaviors, including smoking [6], drinking [7], and sleeping [6,8]; and educational level [6]. Other neurodegenerative diseases, including Parkinson's disease (PD) and amyotrophic lateral sclerosis (ALS), were reported to share some contributing factors with AD, such as smoking, drinking, physical activity, body mass index (BMI), and blood pressure (BP) [9]. Thus, modifying cardiovascular risk factors and lifestyle behaviors may hold promise for reducing the burden of neurodegenerative disease. However, findings from observational studies may be influenced by reverse causation and unmeasured confounding, which can obscure the true association. In addition, the difficulty of conducting large-scale randomized clinical trials (RCTs) restricts the exploration of this association. These limitations call for alternative methods that target causal effects and draw on complementary data sources rather than single cohorts.
Mendelian randomization (MR) is a method that estimates the causal effects of risk factors in observational studies by using genetic variants as instrumental variables, typically drawing on summary statistics from large-cohort genome-wide association studies (GWASs) [10]. Previous MR studies have explored the causal effects of some modifiable lifestyle behaviors, such as sleep characteristics, on AD [11]. However, their results were inconsistent, possibly because of differences in the number of single nucleotide polymorphisms (SNPs) included, statistical power, and collider bias; higher-quality studies with larger sample sizes are therefore needed. A recent study explored the associations between midlife vascular risk factors and the risk of incident dementia [12], but it adopted only all-cause dementia as the outcome, even though AD and vascular dementia may not share the same risk factors.
Herein, we performed a two-sample MR analysis to comprehensively assess the causal effects of genetically determined cardiovascular risk factors and lifestyle behaviors on the risk of late-onset AD (LOAD) as well as familial AD (including paternal AD and maternal AD). Despite the diversity of their clinical symptoms, neurodegenerative diseases are linked in their pathogenesis by the misfolding of proteins in specific brain regions and by significant neuroinflammation and excessive oxidative stress in those areas [13]. For comparison, we therefore also evaluated the causal associations of cardiovascular risk factors and lifestyle behaviors with the risk of two other neurodegenerative diseases, PD and ALS.

Two-sample MR design
In this study, we adopted the two-sample MR design (Fig. 1), a genetic instrumental variable analysis based on summary-level data with single nucleotide polymorphisms (SNPs) as instruments for the risk factor. This analytical approach minimizes the influence of confounding and reverse causation bias, since SNPs are randomly allocated at conception. To obtain valid instrumental variables, it is essential that the MR assumptions hold. These assumptions include: (1) the SNPs are associated with the exposure; (2) the SNPs are independent of confounders of the risk factor-outcome association; and (3) the SNPs influence the outcome only via the exposure [14].
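As an illustration of how a single SNP can act as an instrument, the per-SNP causal estimate is commonly computed as the Wald ratio of the SNP-outcome association to the SNP-exposure association. The Python sketch below is not the study's code; the function name `wald_ratio` and the example values are hypothetical, and the standard error uses a first-order approximation that ignores uncertainty in the SNP-exposure estimate.

```python
def wald_ratio(beta_exposure, beta_outcome, se_outcome):
    """Per-SNP causal estimate: SNP-outcome effect divided by SNP-exposure effect.

    The standard error is a first-order (delta-method) approximation that
    treats the SNP-exposure effect as known without error.
    """
    if beta_exposure == 0:
        raise ValueError("SNP-exposure effect must be non-zero")
    estimate = beta_outcome / beta_exposure
    se = abs(se_outcome / beta_exposure)
    return estimate, se


# Hypothetical SNP: +0.10 SD of exposure per allele, +0.02 log-odds of outcome per allele.
est, se = wald_ratio(0.10, 0.02, 0.01)
print(f"log-OR per 1 SD of exposure: {est:.3f} (SE {se:.3f})")
```

The ratio is only interpretable as a causal effect if the three assumptions listed above hold for that SNP.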

Data source and single nucleotide polymorphism selection
We used GWASs conducted primarily among individuals of European ancestry to identify the genetic instruments for the modifiable risk factors. Since GWASs with small sample sizes and limited statistical power might fail to detect SNP-trait associations [15], we only kept datasets with N > 50,000 and, for binary phenotypes, more than 10,000 cases and more than 10,000 controls. An overview of these data sources is presented in Table 1. Of all the identified variants in each gene, only SNPs that were significantly associated with the exposure (P < 5 × 10⁻⁸) and clumped to a linkage disequilibrium (LD) threshold of r² < 0.001 were considered candidate instruments.

Fig. 1 A The MR assumptions: (1) the selected instrument is predictive of the exposure, (2) the instrument is independent of confounding factors, and (3) there is no horizontal pleiotropy (the instrument is associated with the outcome only through the exposure). B The study design overview of the current study. MR Mendelian randomization, GWAS genome-wide association studies, AD Alzheimer's disease, PD Parkinson's disease, ALS amyotrophic lateral sclerosis, LD linkage disequilibrium, SNP single nucleotide polymorphism.
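The selection criteria above (genome-wide significance followed by LD clumping) can be sketched as a greedy filter. This is a simplified Python illustration rather than the pipeline used in the study: the function `select_instruments`, the toy rsIDs, and the pairwise r² lookup are hypothetical, and the sketch ignores the physical clumping window and the reference panel that real clumping tools rely on.

```python
GWAS_P_THRESHOLD = 5e-8   # genome-wide significance threshold stated in the text
LD_R2_THRESHOLD = 0.001   # LD clumping threshold stated in the text


def select_instruments(snps, ld_r2,
                       p_threshold=GWAS_P_THRESHOLD,
                       r2_threshold=LD_R2_THRESHOLD):
    """Greedy clumping: keep the most significant SNP, drop SNPs in LD with it, repeat.

    snps  : list of dicts with keys "rsid" and "pval" (SNP-exposure association).
    ld_r2 : dict mapping frozenset({rsid_a, rsid_b}) -> pairwise r^2; pairs that
            are absent are treated as independent (r^2 = 0).
    """
    significant = sorted(
        (s for s in snps if s["pval"] < p_threshold),
        key=lambda s: s["pval"],
    )
    kept = []
    for snp in significant:
        in_ld = any(
            ld_r2.get(frozenset((snp["rsid"], k["rsid"])), 0.0) >= r2_threshold
            for k in kept
        )
        if not in_ld:
            kept.append(snp)
    return [s["rsid"] for s in kept]


# Hypothetical toy data (rsIDs and values are made up for illustration).
snps = [
    {"rsid": "rsA", "pval": 1e-12},
    {"rsid": "rsB", "pval": 3e-9},   # in LD with rsA, so it is dropped
    {"rsid": "rsC", "pval": 2e-8},   # independent, so it is kept
    {"rsid": "rsD", "pval": 1e-6},   # not genome-wide significant, so it is dropped
]
ld_r2 = {frozenset(("rsA", "rsB")): 0.4}
instruments = select_instruments(snps, ld_r2)
print(instruments)
```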

Statistical analyses
The inverse-variance weighted (IVW) method with a multiplicative random-effects model was used as the main analysis method. Since this method might be biased by pleiotropy or invalid instruments when not all MR assumptions hold, we tested the validity and robustness of the results with several sensitivity analyses: the MR-Egger, weighted median, simple mode, and weighted mode methods. The weighted median method, which is used to check for invalid-instrument bias, provides a consistent estimate as long as at least 50% of the weight comes from valid instruments [16]. The MR-Egger estimate of the causal effect is less susceptible to directional pleiotropy, so it is preferable when pleiotropy is present [10]. We estimated the intercept of the MR-Egger regression, which represents the average horizontal pleiotropy. We also conducted a leave-one-SNP-out analysis, systematically removing one SNP at a time, to assess the influence of potentially pleiotropic SNPs on the causal estimates. The strength of the genetic instruments was estimated using F-statistics; an F-statistic greater than 10 indicates sufficient instrument strength for MR analysis [17].
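To make the estimators named above concrete, the following Python sketch computes an IVW estimate, an MR-Egger slope and intercept, leave-one-out IVW estimates, and an approximate per-SNP F-statistic from summary statistics. It is an illustration under stated assumptions, not the "TwoSampleMR" implementation: all function names and input values are hypothetical, the F-statistic uses the common (beta/se)² approximation, the multiplicative random-effects standard error is not allowed to fall below the fixed-effect value, and the median- and mode-based estimators are omitted.

```python
import math


def f_statistic(beta_x, se_x):
    """Approximate per-SNP F-statistic from summary data: (beta / se)^2."""
    return (beta_x / se_x) ** 2


def ivw(bx, by, se_y):
    """IVW estimate with a multiplicative random-effects standard error."""
    w = [1.0 / s ** 2 for s in se_y]
    sxx = sum(wi * x * x for wi, x in zip(w, bx))
    beta = sum(wi * x * y for wi, x, y in zip(w, bx, by)) / sxx
    se_fixed = math.sqrt(1.0 / sxx)
    # Cochran's Q measures heterogeneity between the per-SNP estimates.
    q = sum(wi * (y - beta * x) ** 2 for wi, x, y in zip(w, bx, by))
    phi = math.sqrt(q / (len(bx) - 1)) if len(bx) > 1 else 1.0
    # Assumption: never shrink the SE below the fixed-effect value.
    return beta, se_fixed * max(1.0, phi)


def mr_egger(bx, by, se_y):
    """MR-Egger: weighted regression of by on bx with an intercept term."""
    # Orient every SNP so that its exposure effect is positive.
    signs = [1.0 if x >= 0 else -1.0 for x in bx]
    bx = [s * x for s, x in zip(signs, bx)]
    by = [s * y for s, y in zip(signs, by)]
    w = [1.0 / s ** 2 for s in se_y]
    sw = sum(w)
    mx = sum(wi * x for wi, x in zip(w, bx)) / sw
    my = sum(wi * y for wi, y in zip(w, by)) / sw
    sxx = sum(wi * (x - mx) ** 2 for wi, x in zip(w, bx))
    slope = sum(wi * (x - mx) * (y - my) for wi, x, y in zip(w, bx, by)) / sxx
    intercept = my - slope * mx  # estimates the average pleiotropic effect
    return slope, intercept


def leave_one_out(bx, by, se_y):
    """IVW estimate recomputed with each SNP removed in turn."""
    estimates = []
    for i in range(len(bx)):
        keep = [j for j in range(len(bx)) if j != i]
        beta, _ = ivw([bx[j] for j in keep],
                      [by[j] for j in keep],
                      [se_y[j] for j in keep])
        estimates.append(beta)
    return estimates


# Hypothetical summary statistics (made up for illustration).
bx = [0.10, 0.20, 0.15, 0.30]        # SNP-exposure effects
by = [0.5 * x + 0.01 for x in bx]    # SNP-outcome effects with a constant pleiotropic shift
se_y = [0.02, 0.02, 0.02, 0.02]      # SEs of the SNP-outcome effects

ivw_beta, ivw_se = ivw(bx, by, se_y)
egger_slope, egger_intercept = mr_egger(bx, by, se_y)
print(f"IVW: {ivw_beta:.3f} (SE {ivw_se:.3f})")
print(f"MR-Egger slope: {egger_slope:.3f}, intercept: {egger_intercept:.3f}")
print("Leave-one-out IVW:", [round(b, 3) for b in leave_one_out(bx, by, se_y)])
```

In this toy example every SNP carries the same direct effect on the outcome, so the IVW estimate is pulled away from the true slope of 0.5 while the MR-Egger regression absorbs the shift into its intercept.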
The principal statistical analyses were conducted using R (version 4.2.2), and MR analyses were conducted using the "TwoSampleMR" package. We also performed additional MR analyses using MRlap, an R package for MR analyses that permits overlapping samples (Additional file 7). Results were reported as odds ratios (ORs) with corresponding 95% confidence intervals (CIs). Statistical significance was defined as a two-tailed P value < 0.05.
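For a binary outcome, an MR estimate on the log-odds scale is typically converted to the reported OR and 95% CI by exponentiation. The Python sketch below shows this conversion together with a two-tailed P value under a normal approximation; the function name `log_odds_to_or` and the input values are hypothetical and do not correspond to any result in this study.

```python
import math


def log_odds_to_or(beta, se, z_crit=1.959964):
    """Convert a log-odds estimate and its SE to an OR, 95% CI, and two-tailed P value."""
    odds_ratio = math.exp(beta)
    ci_low = math.exp(beta - z_crit * se)
    ci_high = math.exp(beta + z_crit * se)
    z = beta / se
    # Two-tailed P value under a standard normal approximation.
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return odds_ratio, ci_low, ci_high, p_value


# Hypothetical estimate (not a result from the study).
or_est, ci_lo, ci_hi, p_val = log_odds_to_or(beta=-0.20, se=0.08)
print(f"OR {or_est:.3f} [{ci_lo:.3f}-{ci_hi:.3f}], P = {p_val:.4f}")
print("Significant at two-tailed P < 0.05:", p_val < 0.05)
```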

Additional analyses
The causal evidence obtained with the "TwoSampleMR" package remained significant in the additional analyses. We further identified a causal relationship of SBP with LOAD as well as familial AD. In addition, short sleep duration showed a suggestive association with PD. The additional MR analyses with the "MRlap" package also identified four further potential risk factors: hypertension, DBP, diabetes, and 2hGlu. Further details are shown in Table 2.

Fig. 2 The associations of cardiovascular risk factors and lifestyle behaviors with AD risk. Genome-wide significantly associated (P < 5 × 10⁻⁸) independent (linkage disequilibrium r² = 0.001, clumping distance = 10,000 kb) SNPs were used as instruments. AD Alzheimer's disease, SNP single nucleotide polymorphism, HDL high-density lipoprotein cholesterol, LDL low-density lipoprotein cholesterol, SBP systolic blood pressure, DBP diastolic blood pressure, PP pulse pressure, HbA1c glycosylated hemoglobin, 2hGlu 2-hour post-meal blood glucose. *, 0.01 < p < 0.05; **, 0.001 < p < 0.01; ***, p < 0.001.

DISCUSSION
In this two-sample MR study, we found that both LOAD and familial AD were significantly associated with genetically determined cardiovascular risk factors and lifestyle behaviors, including educational level, BMI, plasma lipids, and BP traits. Furthermore, we found causal associations of dyslipidemia and LDL with ALS, as well as potential associations of sleep duration with PD and of smoking initiation with ALS. However, all these results should be interpreted with great caution.
Accumulating observational evidence suggests that a high educational level may protect against AD [18]. In addition, educational level was identified as a protective factor against cognitive deterioration related to other risk factors [19]. The current study likewise showed a protective effect of high educational level on AD risk. The polymorphism rs9320913 (closest gene = MMS22L), which was genome-wide significantly associated with educational level, might be indirectly involved in processes contributing to cognitive reserve through its function in neuron apoptosis and neuroinflammation [20]. Therefore, as a proxy of cognitive reserve, a high educational level might confer resilience against cognitive decline even in the presence of neuropathology [21]. Moreover, one MR study reported a bidirectional association between intelligence and educational level, so prior intelligence might also mediate the association between educational level and AD risk [22]. Together, these findings suggest the potential of reducing AD risk by improving various aspects of cognitive reserve (e.g., with cognitive training), which might compensate for lower educational attainment.
As a genetic instrument of BMI, rs17125944 (closest gene = FERMT2) was also associated with AD risk. Therefore, BMI and AD might be linked by the dysfunction of scaffolding protein. Using the MR approach, the current study supported a protective effect of BMI on AD dementia, which was inconsistent with the result of another MR study [23]. Mukherjee and colleagues used the genome-wide significant SNPs and the associated β weights from the published meta-analysis (249,796 individuals included) to construct polygenic scores. Though no evidence from individual SNPs or polygenic scores indicated that BMI increased AD risk, the effect estimates suggested a potential protective role of BMI in AD risk.

Fig. 3 The associations of cardiovascular risk factors and lifestyle behaviors with PD risk. Genome-wide significantly associated (P < 5 × 10⁻⁸) independent (linkage disequilibrium r² = 0.001, clumping distance = 10,000 kb) SNPs were used as instruments. PD Parkinson's disease, SNP single nucleotide polymorphism, HDL high-density lipoprotein cholesterol, LDL low-density lipoprotein cholesterol, SBP systolic blood pressure, DBP diastolic blood pressure, PP pulse pressure, HbA1c glycosylated hemoglobin, 2hGlu 2-hour post-meal blood glucose. *, 0.01 < p < 0.05.

Fig. 4 The associations of cardiovascular risk factors and lifestyle behaviors with ALS risk. Genome-wide significantly associated (P < 5 × 10⁻⁸) independent (linkage disequilibrium r² = 0.001, clumping distance = 10,000 kb) SNPs were used as instruments. ALS amyotrophic lateral sclerosis, SNP single nucleotide polymorphism, HDL high-density lipoprotein cholesterol, LDL low-density lipoprotein cholesterol, SBP systolic blood pressure, DBP diastolic blood pressure, PP pulse pressure, HbA1c glycosylated hemoglobin, 2hGlu 2-hour post-meal blood glucose. *, 0.01 < p < 0.05; **, 0.001 < p < 0.01; ***, p < 0.001.
Building on Mukherjee's work, we updated the MR analyses using summary data from a recent meta-analysis with a larger sample (over 1 million participants) and further supported the protective role of BMI in AD risk. The hormone leptin, mainly secreted by adipose tissue, might work as a cognitive enhancer. In animal models, leptin was found to enhance adult neurogenesis and reduce pathological features by modulating the formation of senile plaques, as well as attenuating Aβ-induced neurodegeneration and superoxide anion production [24]. In addition, in the Framingham study, lower leptin levels were reported as a risk factor for developing AD over a 12-year follow-up [25]. However, studies showed that higher BMI in midlife increased dementia risk [26], while late-life BMI might exert the opposite effect [27]. Hence, BMI might have age-dependent effects on AD risk. In addition, the estimated association between BMI and AD risk might be distorted by selection bias and epigenetic effects; a spurious association between BMI and AD could arise in the presence of competing risks or epigenetic influences on the outcome.
Dyslipidemia has been implicated as a risk factor for AD [28]. Genetic enrichment in AD was predominantly related to plasma lipids, as exemplified by rs3844143 (closest gene = PICALM) [29], but the mechanism by which LDL modulates the risk of AD remains elusive. A high plasma level of LDL affects the flux of the oxidized metabolite 27-hydroxycholesterol (27OH) from the circulation into the brain. The excessive accumulation of 27OH in the brain then leads to elevated deposition of Aβ, a key initiating event in AD [30]. In addition, a high plasma level of LDL was implicated in the impairment of the blood-brain barrier (BBB) [31], which enables circulating LDL to enter the brain and exert a direct effect on the pathogenesis of AD, further promoting Aβ deposition [32]. However, this MR analysis showed evidence of horizontal pleiotropy in the relationship between LDL and AD, so the effects of confounders could not be ignored. Many studies have demonstrated that high concentrations of LDL are associated with coronary heart disease and carotid artery atherosclerosis, which, in turn, may lead to cognitive decline through cerebral embolism or hypoperfusion [33].
Using neuroimaging, previous prospective studies showed significant associations of increased BP with smaller brain volumes [34] as well as increased Aβ brain burden [35]. This evidence indicated that BP might affect cognitive function by influencing brain volume and Aβ deposition. However, several previous studies indicated that higher AD polygenic risk scores (PRS) were associated with lower BP [36], which is consistent with the inverse association between BP and the risk of AD found in the current MR study. The inconsistency might have several explanations: (1) antihypertensive medications might mask the real association between BP and the risk of AD, since calcium channel blockers (CCBs) were identified as a promising strategy for AD prevention [37]; (2) BP might be positively related to AD risk only above a certain threshold, that is, BP might act as a double-edged sword; (3) like BMI, BP might have age-dependent effects on AD risk; and (4) the true association between BP and AD risk could be obscured by BP-related cardiovascular diseases, which serve as competing risks.
Both dyslipidemia and LDL were associated with ALS risk in this MR analysis, in line with a follow-up study of more than 20 years [38]. LDL is the major carrier of cholesterol in the peripheral circulation. Excess cholesterol is metabolized to a more soluble form, oxysterols. Oxysterols can readily cross the barrier, and increased intracellular oxysterol levels reduce cell viability, especially in neuronal cells [39]. Impairments of the BBB and the blood-spinal cord barrier (BSCB) were reported in ALS patients and SOD-1 mouse models [40,41]. We therefore hypothesized that oxysterols might be involved in oxidative stress, resulting in the dysfunction of neurons and the destruction of the BBB and BSCB.

Table 2. Associations of cardiovascular risk factors and lifestyle behaviors with AD, PD, and ALS (by "MRlap" package). Columns: Exposures; AD_IVW beta (se); Paternal AD_IVW beta (se); Maternal AD_IVW beta (se); PD_IVW beta (se); ALS_IVW beta (se).
We also found a positive association between educational level and PD risk; this finding contradicts the motor reserve theory, which holds that a high educational level protects against PD through a protective effect on white matter integrity [42]. Since educational level is influenced by both genetic and familial environmental factors, future longitudinal co-twin control studies may explain the conflicting results. Using the "MRlap" package, both short and long sleep duration were identified as potential risk factors for PD, which suggests a non-linear association of sleep duration with the risk of PD. The sleep-wake cycle was reported to increase extracellular levels of tau and alpha-synuclein [43]. In addition, brain autopsy revealed that increased actigraphy-derived sleep fragmentation in older subjects without PD was associated with an increased burden of PD pathology [44]. Together, these findings indicate that potential pathways between sleep duration and PD might include increased oxidative stress or reduced clearance of extracellular alpha-synuclein. Interestingly, another study demonstrated that the rs2028122 genotype partially mediated, with a positive effect, the causal pathway from sleep duration to the development of PD [45]; therefore, the association between sleep duration and PD risk might vary across rs2028122 genotypes.
To sum up, the current MR study identified a set of cardiovascular risk factors and lifestyle behaviors that may contribute to neurodegenerative diseases. The identified risk factors point to common underlying mechanisms among neurodegenerative diseases: neuroinflammation and excessive oxidative stress. Since the BBB is the main protective barrier of the central nervous system (CNS), increased BBB permeability could disrupt homeostasis and affect innate and adaptive immune responses. When communication between the brain and the systemic immune system fails, inflammatory mediators accumulate in the CNS in a process known as neuroinflammation and trigger brain damage [46]. In addition, the CNS has a high metabolic rate, which favors free radical formation. Under oxidative stress, dysfunctional mitochondria fail to produce the high energy levels required by neuronal cells to perform their normal biochemical and physiological functions, leading to rapid cell death [47]. Neuroinflammation, oxidative stress, and mitochondrial dysfunction lead to the aggregation of misfolded proteins (Aβ and tau in AD; α-Syn in PD; and TDP-43 [TAR DNA-binding protein] and SOD-1 [superoxide dismutase] in ALS), which could trigger each disease [48]. Therefore, preventing or minimizing neuroinflammation throughout all stages of life might proactively and cumulatively reduce the risk of developing neurodegenerative diseases.
Several limitations should be considered. First, although genetic variants were derived from studies with relatively large sample sizes, our findings might still be affected by weak instrument bias. Subsequent studies need to strike a balance between including fewer variants (potentially having insufficient power) and including more variants (potentially including pleiotropic variants). Second, we could not exclude the possibility of an inflated type 1 error rate, since there was overlap between the exposure GWASs and the outcome GWASs. In addition, most GWASs recruited participants in middle-to-old age, so that participants were inevitably survivors with respect to the genetic instruments. As a result, the effect estimates might be distorted by survivor bias and selection bias. Third, a growing body of evidence suggested that the impact of environmental influences might extend beyond the DNA sequence, so epigenetic bias could distort the effects detected in our MR study. Future two-step epigenetic MR studies, with a better understanding of the causal role of epigenetics (such as DNA methylation) in mediating environmental influences on common complex diseases, could help address the potential for confounding and reverse causation. Fourth, the explanation of the current MR-based results by the amyloid hypothesis requires further investigation, since other studies yielded null findings regarding amyloid and AD [49]. Future progress in neuroimaging might help to clarify the uncertain mechanisms. Last, our population was limited to individuals of European ancestry, which might limit the generalizability of our findings to other ethnic groups.
In this two-sample MR study, we found that LDL, BMI, BP, and educational level were causally related to AD. Besides, we also observed a significant association between LDL and ALS risk as well as the potential effect of sleep duration on PD risk. All these results implied that targeting these modifiable factors could facilitate the prevention of neurodegenerative disease.

DATA AVAILABILITY
All the data used in this study can be acquired from the original genome-wide association studies that are mentioned in the text or in its additional files. Any other data generated in the analysis process can be requested from the corresponding author.